The feature extraction for classifying words on social media with the Naïve Bayes algorithm

Authors

Abstract

Classification with the Naïve Bayes classifier (NBC) requires prior pre-processing and feature extraction. Generally, pre-processing eliminates unnecessary words, while feature extraction processes the words that remain. This paper focuses on feature extraction, in which calculations and searches are performed by applying word2vec, with word frequency weighted using term frequency-inverse document frequency (TF-IDF). The classification process uses 1734 tweets from Twitter, each treated as a document for the TF-IDF weight calculation: for words that often come out in tweets, the TF-IDF value decreases, and vice versa. Once the word weights are obtained, classification is carried out on the test data, yielding an accuracy of 88.8% for the slack word category and 78.79% for the verb category. It can be concluded that data in the form of words available on Twitter can be classified, and that words referring to slack words and verbs are classified with a fairly good level of accuracy, which reflects the habits of social media users.
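The TF-IDF weighting and Naïve Bayes step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy tweets, the labels, the function names `tfidf`, `train_nb`, and `predict_nb`, the unsmoothed `log(N / df)` form of IDF, and the Laplace smoothing constant `alpha` are all hypothetical choices made here, and the word2vec step is not covered.

```python
import math
from collections import Counter, defaultdict

def tfidf(docs):
    """Return one {word: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n_docs / df[w]) for w in df}
    weighted = []
    for d in docs:
        tf = Counter(d)
        weighted.append({w: (c / len(d)) * idf[w] for w, c in tf.items()})
    return weighted

def train_nb(weighted_docs, labels, alpha=1.0):
    """Multinomial-style Naive Bayes trained on TF-IDF weights (assumed setup)."""
    vocab = {w for d in weighted_docs for w in d}
    class_counts = Counter(labels)
    word_mass = defaultdict(Counter)  # per-class sum of weights for each word
    for d, y in zip(weighted_docs, labels):
        for w, v in d.items():
            word_mass[y][w] += v
    model = {}
    for y, n_y in class_counts.items():
        total = sum(word_mass[y].values()) + alpha * len(vocab)
        log_prior = math.log(n_y / len(labels))
        log_lik = {w: math.log((word_mass[y][w] + alpha) / total) for w in vocab}
        model[y] = (log_prior, log_lik)
    return model

def predict_nb(model, tokens):
    """Pick the class with the highest log posterior; unseen words are skipped."""
    def score(y):
        log_prior, log_lik = model[y]
        return log_prior + sum(log_lik[w] for w in tokens if w in log_lik)
    return max(model, key=score)

# Hypothetical toy data: four tokenized "tweets" with made-up labels.
docs = [
    ["gonna", "chill", "lol"],
    ["lol", "wanna", "chill"],
    ["run", "write", "read"],
    ["read", "run", "lol"],
]
labels = ["slack", "slack", "verb", "verb"]

weights = tfidf(docs)
model = train_nb(weights, labels)
print(weights[0])                             # "lol" gets the lowest weight here
print(predict_nb(model, ["wanna", "chill"]))
```

In this toy corpus "lol" appears in three of the four tweets, so its weight is lower than that of "gonna", which appears in only one. That matches the behaviour the abstract describes: the more often a word comes out across tweets, the lower its TF-IDF value.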

Similar resources

The Algorithm for Solving the Inverse Numerical Range Problem

We denote the numerical range of a square matrix A by W(A) and define it as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.

The Impact of Feature Extraction on the Performance of a Classifier: kNN, Naïve Bayes and C4.5

“The curse of dimensionality” is pertinent to many learning algorithms, and it denotes the drastic rise in computational complexity and classification error in high dimensions. In this paper, different feature extraction techniques as means of (1) dimensionality reduction, and (2) constructive induction are analyzed with respect to the performance of a classifier. Three commonly used class...

A Naïve Bayes Approach to Classifying Topics in Suicide Notes

The authors present a system developed for the 2011 i2b2 Challenge on Sentiment Classification, whose aim was to automatically classify sentences in suicide notes using a scheme of 15 topics, mostly emotions. The system combines machine learning with a rule-based methodology. The features used to represent a problem were based on lexico-semantic properties of individual words in addition to reg...

The Effect of Risk Aversion on the Demand for Life Insurance: The Case of Iranian Life Insurance Market

Abstract: About 60% of the insurance industry's total premium worldwide pertains to life policies, while in Iran the life insurance total premium was less than 6% of the insurance industry's total premium in 2008 (Sigma, No 3/2009). Among the factors that discourage the life insurance industry is the problem of adverse selection. Adverse selection theory describes a situation where the inf...

Bug Classification: Feature Extraction and Comparison of Event Model using Naïve Bayes Approach

In the software industry, individuals at different levels, from customers to engineers, apply diverse mechanisms to determine the class to which a particular bug should be allocated. Sometimes a simple search on the Internet might help, but in many other cases a lot of effort is spent analyzing the bug report to classify the bug. So there is a great need for a structured mining algorithm where given a cr...

Journal

Journal title: IAES International Journal of Artificial Intelligence

Year: 2022

ISSN: 2089-4872, 2252-8938

DOI: https://doi.org/10.11591/ijai.v11.i3.pp1041-1048